Abstract: We propose a new handwritten digit recognition
system based on a parallel combination of SVM classifiers
that manages the conflict between their outputs. First, we
evaluate different feature generation methods for training
SVM classifiers that operate independently of each other.
To improve the performance of the system, the outputs of the
SVM classifiers are combined through the Dezert-Smarandache
theory. The proposed framework combines the calibrated SVM
outputs obtained from a sigmoid transformation and uses an
estimation technique based on a supervised model to compute
the belief assignments. Decision making is performed by
maximizing the new Dezert-Smarandache probability. The
proposed system is evaluated on the well-known US Postal
Service database. Experimental results show that the
proposed combination framework improves the recognition
rate even when the individual SVM classifiers provide
conflicting outputs.

Keywords-Handwriting digit recognition; Support Vector 
I. Introduction

The first systems to emerge in optical character
recognition (OCR) were those for reading postal addresses
for mail sorting, automatic reading of handwriting on forms,
etc. Despite extensive research in this area, handwriting
recognition remains an open and important problem.

The basic task of such a system is the recognition of
isolated handwritten digits: the idea is to focus on only
one digit at a time. This task faces several difficulties,
such as the variability in digit size, which occurs even
among digits of the same class, the differences in writing
between individuals, the complexity of separating the digit
from the background, the thickness of the writing, and the
inclination angle. All these parameters vary, which makes
the task complex and difficult.

These constraints have led to the development of a large
number of classifiers and feature generation methods. Rather
than trying to optimize a single classifier by choosing the
best features for a given problem, researchers have found it
more interesting to combine recognition methods [1], [2].
Indeed, combining classifiers exploits the redundant and
complementary nature of the responses of different
classifiers.

However, given the constraints mentioned above, an
appropriate operating method based on mathematical
approaches is needed, one that takes into account two
notions: the uncertainty and the imprecision of the
classifier responses.

In general, non-probabilistic approaches such as Support
Vector Machines (SVMs) [3] are able to represent uncertain
knowledge but cannot easily model information that is
imprecise, incomplete, or not totally reliable. Moreover,
they often conflate the concepts of uncertainty and
imprecision with the probability measure. Indeed, modelling
through these approaches allows reasoning only on
singletons, which represent the different hypotheses
(classes), under the closed-world assumption. Therefore,
several theories for modelling both uncertainty and
imprecision have been introduced [4], [5], [6], [7].

Researchers have proposed an increasingly large and varied
set of approaches for combining classifiers, which has led
to the development of several schemes that treat data in
different ways [1], [2]. Generally, three approaches for
combining classifiers can be considered: the parallel
approach, the sequential approach and the hybrid approach
[1], [2]. Furthermore, each of these can be performed at a
class level, at a rank level, or at a measure level [8],
[9], [10]. In a class level combination, the opinion of the
classifier is binary. The response of the classifier can
then be represented by a binary vector in which “1”
indicates the class proposed by the classifier. A classifier
can also produce a set of classes; it then considers that
the pattern belongs to one class of this set without giving
any further information that would allow discriminating
between the classes. In a rank level combination, the
classifier ranks the classes and outputs a vector of ranks.
The class placed at the first rank is considered the most
probable for a given pattern, and the class at the last rank
the least probable. In a measure level combination, the
classifier indicates its confidence in its proposal. The
output of the classifier is a vector of measures (normalized
or not), which may be a distance, a posterior probability, a
confidence value, a match score, a belief function, a
possibility, a credibility, a fuzzy measure, etc.
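
To make the three output levels concrete, the following minimal Python sketch derives a class level, a rank level and a measure level output from a single vector of classifier scores. The function names and the example scores are illustrative assumptions and are not part of the system described in this paper.

```python
def class_level(scores):
    """Binary vector with a single 1 at the class proposed by the classifier."""
    best = max(range(len(scores)), key=lambda c: scores[c])
    return [1 if c == best else 0 for c in range(len(scores))]


def rank_level(scores):
    """Vector of ranks: rank 1 is given to the most probable class."""
    order = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    ranks = [0] * len(scores)
    for rank, c in enumerate(order, start=1):
        ranks[c] = rank
    return ranks


def measure_level(scores):
    """The measures themselves (left unnormalized in this sketch)."""
    return list(scores)


scores = [0.1, 0.7, 0.2]      # hypothetical scores for three classes
print(class_level(scores))    # [0, 1, 0]
print(rank_level(scores))     # [3, 1, 2]
print(measure_level(scores))  # [0.1, 0.7, 0.2]
```

Only the measure level output keeps the relative confidence between classes, which is why it is the level used for the combination studied here.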

In this research, we focus on the parallel combination of
two SVM classifiers at the measure level. The combination
framework that we propose for the recognition of isolated
handwritten digits is based on the Dezert-Smarandache theory
(DSmT). We first evaluate different feature generation
methods for training the SVM classifiers, which operate
independently of each other. The outputs of the SVM
classifiers provide degrees of imprecision for the
recognition task. We then transform these outputs into
posterior probabilities using a sigmoid transformation.
Finally, in order to enhance the performance of the
handwritten digit recognition system, we propose a
supervised model based on DSmT for managing the conflict
between the two SVM classifiers.

The paper is organized as follows. Section II reviews the
Proportional Conflict Redistribution (PCR6) rule based on
DSmT. Section III describes the proposed recognition system.
Experiments conducted on the USPS database of isolated
handwritten digits are presented in Section IV. The last
section summarizes the proposed combination framework and
outlines future research directions.

II. Review of PCR6 Combination Rule 

Generally, handwritten digit recognition is formulated as a
ten-class problem in which the classes are associated with
the handwritten Arabic digits, namely
$\theta_0, \theta_1, \ldots, \theta_9$. The parallel
combination of two classifiers, considered as information
sources $S_1$ and $S_2$, respectively, is performed through
the PCR6 combination rule based on DSmT. For a ten-class
problem, a reference domain, also called the frame of
discernment, must be defined for performing the combination.
It is composed of a finite set of exhaustive and mutually
exclusive hypotheses.

In the context of probability theory, the frame of
discernment, denoted $\Theta$, is composed of ten elements,
$\Theta = \{\theta_0, \theta_1, \ldots, \theta_9\}$, and a
mapping function $m(\cdot) \in [0,1]$ is associated with
each class, which defines the corresponding mass verifying
$m(\emptyset) = 0$ and $\sum_{i=0}^{9} m(\theta_i) = 1$.
When combining two sources of information, the combination
rule defined in the Bayesian framework [11] and the weighted
mean and consensus based methods [12], [13], [14] seem
effective for non-conflicting responses. In the opposite
case, an alternative approach has been developed in the DSmT
framework to deal with (highly) conflicting, imprecise and
uncertain sources of information [7]. An example of such
approaches is the PCR6 rule.

The main concept of DSmT is to distribute a unitary mass of
certainty over all the composite propositions built from the
elements of $\Theta$ with the $\cup$ (union) and $\cap$
(intersection) operators, instead of making this
distribution over the elementary hypotheses only. Therefore,
the hyper-powerset $D^\Theta$ is defined as:

1. $\emptyset, \theta_0, \ldots, \theta_{n-1} \in D^\Theta$.

2. If $A, B \in D^\Theta$, then $A \cap B \in D^\Theta$ and
$A \cup B \in D^\Theta$.

3. No other elements belong to $D^\Theta$, except those
obtained by using rules 1 or 2.
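
As an illustration of the three rules, the sketch below enumerates the hyper-powerset of a small frame by closing the generators under union and intersection. Encoding each element as a set of Venn-diagram regions is a convenience assumed here, and the function name is hypothetical; the construction is only practical for very small $n$, since the size of $D^\Theta$ grows extremely fast.

```python
from itertools import combinations


def hyper_power_set(n):
    """Enumerate D^Theta for a free model with n classes.

    A Venn region is labelled by the non-empty set of classes that
    overlap in it; theta_i is the set of regions whose label contains i.
    """
    regions = [frozenset(c)
               for r in range(1, n + 1)
               for c in combinations(range(n), r)]
    thetas = [frozenset(reg for reg in regions if i in reg) for i in range(n)]

    # Rule 1: the empty set and the generators belong to D^Theta.
    elements = {frozenset()} | set(thetas)

    # Rule 2: close under union and intersection until nothing new appears.
    changed = True
    while changed:
        changed = False
        for a in list(elements):
            for b in list(elements):
                for c in (a | b, a & b):
                    if c not in elements:
                        elements.add(c)
                        changed = True
    return elements


print(len(hyper_power_set(2)))  # 5
print(len(hyper_power_set(3)))  # 19
```

For two classes the five elements are $\emptyset$, $\theta_0$, $\theta_1$, $\theta_0 \cap \theta_1$ and $\theta_0 \cup \theta_1$.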



The DSmT uses the generalized basic belief mass, also known
as the generalized basic belief assignment (gbba), computed
on the hyper-powerset of $\Theta$ and defined by a map
$m(\cdot): D^\Theta \to [0,1]$ associated with a given
source of evidence, which can support paradoxical
information, as follows: $m(\emptyset) = 0$ and
$\sum_{A \in D^\Theta} m(A) = 1$. The combined masses
$m_{PCR6}$ obtained from $m_1(\cdot)$ and $m_2(\cdot)$ by
means of the PCR6 rule [7] are defined as:

$$ m_{PCR6}(A_i) = \begin{cases} 0 & \text{if } A_i \in \Phi, \\ m_\wedge(A_i) + \sum_{k=1}^{2} m_k(A_i)^2 \, L_k & \text{otherwise,} \end{cases} \tag{1} $$

where

$$ L_k = \sum_{Y_{\sigma_k(1)} \cap A_i \in \Phi} \frac{m_{\sigma_k(1)}(Y_{\sigma_k(1)})}{m_k(A_i) + m_{\sigma_k(1)}(Y_{\sigma_k(1)})}. \tag{2} $$

Here $\Phi = \{\Phi_M, \emptyset\}$ is the set of all
relatively and absolutely empty elements, $\Phi_M$ is the
set of all elements of $D^\Theta$ which have been forced to
be empty in the hybrid model $M$ defined by the exhaustive
and exclusive constraints, $\emptyset$ is the empty set, the
denominator $m_k(A_i) + m_{\sigma_k(1)}(Y_{\sigma_k(1)})$ is
different from zero, and $\sigma_k(1)$ counts from 1 to 2
avoiding $k$, i.e. $\sigma_1(1) = 2$ and $\sigma_2(1) = 1$.
The term $m_\wedge(A_i)$ represents the conjunctive
consensus, also called the DSm Classic (DSmC) combination
rule [7], which is defined as:

$$ m_\wedge(A_i) = \begin{cases} 0 & \text{if } A_i \in \Phi, \\ \sum_{X, Y \in D^\Theta,\ X \cap Y = A_i} m_1(X) \, m_2(Y) & \text{otherwise.} \end{cases} \tag{3} $$
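
The two-source PCR6 rule can be sketched in a few lines of Python. The sketch assumes Shafer's model, in which the classes are mutually exclusive, so that a focal element can be stored as a `frozenset` of class indices and an intersection is conflicting exactly when it is empty. The function name and the example masses are hypothetical.

```python
def pcr6_two_sources(m1, m2):
    """Combine two mass functions with the PCR6 rule (two sources).

    m1, m2: dicts mapping frozenset (focal element) -> mass.
    Assumes Shafer's model: X and Y conflict when X & Y is empty.
    """
    combined = {}
    for x, mx in m1.items():
        for y, my in m2.items():
            product = mx * my
            if product == 0.0:
                continue
            inter = x & y
            if inter:
                # Conjunctive consensus: the product goes to the intersection.
                combined[inter] = combined.get(inter, 0.0) + product
            else:
                # Partial conflict: give the product back to x and y,
                # proportionally to the masses that produced it.
                combined[x] = combined.get(x, 0.0) + mx * product / (mx + my)
                combined[y] = combined.get(y, 0.0) + my * product / (mx + my)
    return combined


A, B = frozenset({0}), frozenset({1})
m1 = {A: 0.6, B: 0.4}   # hypothetical masses from source S1
m2 = {A: 0.2, B: 0.8}   # hypothetical masses from source S2
mc = pcr6_two_sources(m1, m2)
print(round(mc[A], 4), round(mc[B], 4))  # 0.3524 0.6476
```

In the example, the conflicting product $m_1(A) m_2(B) = 0.48$ is split between $A$ and $B$ in the ratio $0.6 : 0.8$, so the combined masses still sum to one and no mass is transferred to elements that were not involved in the conflict.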
III. System Description

The system shown in Fig. 1 is composed of two individual
systems using SVM classifiers, which are combined through
the PCR6 rule. In the following, we describe each module of
the system.




Figure 1. Structure of the recognition system. 



A. Pre-processing 

The acquired image of an isolated digit must be processed
to facilitate feature generation. In our case, the
pre-processing module includes a binarization step using the
method of Otsu [15], which eliminates the homogeneous
background of the isolated digit and keeps the foreground
information.
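
A minimal pure-Python sketch of Otsu's thresholding is given below for illustration. It assumes 8-bit grey levels stored in a flat list and treats pixels brighter than the threshold as foreground; both choices, and the function names, are assumptions rather than details taken from the system described here.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes the between-class variance.

    Pixels with value <= t form one class, the remaining pixels the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(levels):
        w0 += hist[t]           # number of pixels with value <= t
        sum0 += t * hist[t]     # sum of their grey levels
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (assumed foreground) or 0 (assumed background)."""
    return [1 if p > threshold else 0 for p in pixels]


image = [10] * 5 + [200] * 5   # hypothetical two-level image
t = otsu_threshold(image)
print(t, binarize(image, t))   # 10 [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```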

B. Feature Generation 

The objective of the feature generation step is to
highlight the relevant information that initially exists in
the raw data. Thus, an appropriate choice of the descriptor
significantly improves the accuracy of the recognition
system. In this study, we use a collection of popular
feature generation methods, which can be categorized into
background features [16], [17], foreground features [16],
[17], geometric features [2], and uniform grid features
[18], [19].

C. Classification Based On SVM 

Currently, SVMs are widely used in many pattern recognition
applications, such as handwritten digit recognition [2].
Their concept is based on the structural risk minimization
principle [3]. They proceed by mapping data into a
high-dimensional dot product space via a kernel function. In
this space, an optimal hyperplane that maximizes the margin
of separation between the two classes is calculated.



Let $D$ be a set of $N$ learning samples which are
separable into $n$ classes
$[\theta_0, \theta_1, \ldots, \theta_{n-1}]$, such that
$D = \{(x_\ell, y_\ell),\ \ell = 1, \ldots, N,\ y_\ell \in \{0, 1, \ldots, n-1\}\}$.
In this paper, the combination of binary SVMs is performed
using the multi-class implementation based on the One
Against All (OAA) method [20], in which each SVM is designed
to separate one class from all the others ($n$ SVMs are
trained to solve an $n$-class problem). Thus, to solve the
handwritten digit recognition problem, 10 SVMs trained over
the full database are required.
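
The OAA decomposition itself is independent of the SVM implementation. The sketch below shows only the bookkeeping: how the multi-class labels are turned into $n$ binary target vectors, and how a class is chosen from the $n$ binary decision values when a single OAA classifier is used on its own. The function names and the example values are hypothetical, and the training of the binary SVMs is deliberately left out.

```python
def one_against_all_targets(labels, n_classes):
    """Build one +1/-1 target vector per class (class c versus the rest)."""
    return [[1 if y == c else -1 for y in labels] for c in range(n_classes)]


def oaa_predict(decision_values):
    """Pick the class whose binary SVM gives the largest decision value."""
    return max(range(len(decision_values)), key=lambda c: decision_values[c])


labels = [0, 2, 1, 2]  # hypothetical class labels of four training samples
for c, targets in enumerate(one_against_all_targets(labels, 3)):
    print(c, targets)
# 0 [1, -1, -1, -1]
# 1 [-1, -1, 1, -1]
# 2 [-1, 1, -1, 1]

# Hypothetical outputs f_0(x), f_1(x), f_2(x) of the three binary SVMs
print(oaa_predict([-1.3, 0.4, -0.2]))  # 1
```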

D. Classification Based On DSmT 

The proposed combination module consists of three steps:
i) transformation of the SVM outputs into belief assignments
using an estimation technique based on a calibration method
and a supervised model, ii) combination of the masses
through a combination rule, and iii) a decision rule.



1) Estimation of Masses: In this paper, the SVM outputs are
calibrated using the sigmoidal transformation of Platt [21],
and the masses of the simple classes and of their
complementary classes are estimated using a supervised
model. Let $m_1(\cdot)$ and $m_2(\cdot)$ denote the gbba
provided by two distinct information sources $S_1$ (first
descriptor) and $S_2$ (second descriptor), and let $F$ be
the set of focal elements for each source, such that
$F = \{\theta_0, \theta_1, \ldots, \theta_{n-1}, \bar{\theta}_0, \bar{\theta}_1, \ldots, \bar{\theta}_{n-1}\}$.
The classes $\theta_i$ are separable (each one relative to
its complementary class $\bar{\theta}_i$) using the SVM
multi-class implementation (OAA): they correspond to the
different singletons of the handwritten digits, which are
assumed to be known. Therefore, each compound element
$A_i \notin F$ has a mass $m_1$ equal to zero. On the other
hand, the mass of the complementary element
$\bar{\theta}_i = \bigcup_{0 \le j \le n-1,\ j \ne i} \theta_j$
is different from zero and represents the mass of the
partial ignorance. The same reasoning is applied to the
classes issued from the second source $S_2$ and
$m_2(\cdot)$. Hence, both gbba $m_1(\cdot)$ and
$m_2(\cdot)$ are given as follows:

$$ m_b(\theta_i) = \frac{P_b(\theta_i | x)}{Z_b}, \quad \forall \theta_i \in F, \tag{4} $$

$$ m_b(\bar{\theta}_i) = \frac{\sum_{j=0,\ j \ne i}^{n-1} P_b(\theta_j | x)}{Z_b}, \quad \forall \bar{\theta}_i \in F, \tag{5} $$

$$ m_b(A_i) = 0, \quad \forall A_i \in \Phi = D^\Theta \setminus F, \tag{6} $$

where $Z_b = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} P_b(\theta_j | x)$
are normalization factors introduced in the axiomatic
approach in order to respect the mass definition, and $P_b$
are the posterior probabilities issued from the first source
($b = 1$) and the second source ($b = 2$), respectively.
They are given for a test digit $x$ as follows [21]:

$$ P_b(\theta_i | x) = \frac{1}{1 + \exp\left(A_{ib} \, f_{ib}(x) + B_{ib}\right)}. \tag{7} $$

$A_{ib}$ and $B_{ib}$ are the parameters of the sigmoidal
function, tuned by minimizing the negative log-likelihood of
the learning samples for each class of digits $\theta_i$,
and $f_{ib}(x)$ is the $i$-th output of the binary SVM
classifier issued from the source $S_b$, such that
$i = 0, 1, \ldots, n-1$ and $b \in \{1, 2\}$.
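
The mass estimation step can be sketched as follows, under two assumptions: the sigmoid parameters are taken as already fitted (the values used below are hypothetical), and the classes are mutually exclusive, so that a complementary class can be stored as the set of all other class indices. The function names are illustrative.

```python
import math


def platt_probability(f, a, b):
    """Platt's sigmoid: posterior probability from an SVM output f."""
    return 1.0 / (1.0 + math.exp(a * f + b))


def estimate_masses(posteriors):
    """Masses of the simple classes and of their complementary classes.

    posteriors: list of P_b(theta_i | x) for one source b.
    Returns a dict frozenset(class indices) -> mass that sums to one.
    """
    n = len(posteriors)
    total = sum(posteriors)
    z = n * total                       # double sum over i and j of P(theta_j | x)
    all_classes = frozenset(range(n))
    masses = {}
    for i, p in enumerate(posteriors):
        simple = frozenset([i])
        complement = all_classes - simple
        masses[simple] = masses.get(simple, 0.0) + p / z
        masses[complement] = masses.get(complement, 0.0) + (total - p) / z
    return masses


# Hypothetical SVM outputs and sigmoid parameters for a 3-class problem
outputs = [1.2, -0.4, -1.5]
posteriors = [platt_probability(f, a=-2.0, b=0.0) for f in outputs]
masses = estimate_masses(posteriors)
print(round(sum(masses.values()), 6))  # 1.0
```

With a negative slope parameter, a larger SVM output gives a larger posterior probability, and the complementary class of the most likely digit receives the smallest share of the partial ignorance.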

2) Combination of Masses: In order to manage the conflict
generated by the two information sources $S_1$ and $S_2$
(i.e. both SVM classifications), the combined masses are
computed as follows:

$$ m_c = m_1 \oplus m_2, \tag{8} $$

where $\oplus$ denotes the PCR6 combination rule.

3) Decision Rule: The decision on the membership of a
handwritten digit in one of the simple classes of $\Theta$
is made using a statistical classification technique. First,
the combined beliefs are converted into a probability
measure using a new probabilistic transformation, called the
Dezert-Smarandache probability (DSmP), that maps a belief
measure to a subjective probability measure [7], defined as:

$$ DSmP_\varepsilon(\theta_i) = m_c(\theta_i) + \left(m_c(\theta_i) + \varepsilon\right) \sum_{\substack{A_j \in 2^\Theta \\ A_j \supset \theta_i \\ C_M(A_j) \ge 2}} \frac{m_c(A_j)}{\sum_{\substack{A_k \in 2^\Theta \\ A_k \subset A_j \\ C_M(A_k) = 1}} m_c(A_k) + \varepsilon \, C_M(A_j)}, \tag{9} $$

where $i \in \{0, 1, \ldots, 9\}$, $\varepsilon \ge 0$ is a
tuning parameter, $M$ is Shafer's model for $\Theta$, and
$C_M(A_k)$ denotes the DSm cardinal [7] of $A_k$. The
maximum likelihood (ML) test is then used for decision
making as follows:

$$ x \in \theta_i \quad \text{if} \quad DSmP_\varepsilon(\theta_i) = \max\left\{ DSmP_\varepsilon(\theta_j),\ 0 \le j \le 9 \right\}, \tag{10} $$

where $x$ is the test handwritten digit characterized by
both descriptors, which are used during the feature
generation step, and $\varepsilon$ is fixed to 0.001 in the
decision measure given by (9).
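
The DSmP transformation restricted to singletons, followed by the maximum decision, can be sketched as below. The sketch assumes Shafer's model, so that the DSm cardinal of an element is simply the number of classes it contains and elements can be stored as `frozenset` objects; the function names and the example masses are hypothetical.

```python
def dsmp_singletons(masses, n_classes, eps=0.001):
    """DSmP of each singleton class from combined masses (Shafer's model).

    masses: dict frozenset(class indices) -> combined mass m_c.
    """
    probabilities = []
    for i in range(n_classes):
        m_i = masses.get(frozenset([i]), 0.0)
        shared = 0.0
        for element, mass in masses.items():
            # Only compound elements containing class i redistribute mass to it.
            if len(element) >= 2 and i in element:
                denom = sum(masses.get(frozenset([k]), 0.0) for k in element)
                denom += eps * len(element)
                shared += mass / denom
        probabilities.append(m_i + (m_i + eps) * shared)
    return probabilities


def decide(masses, n_classes, eps=0.001):
    """Return the index of the class with the largest DSmP."""
    p = dsmp_singletons(masses, n_classes, eps)
    return max(range(n_classes), key=lambda i: p[i])


# Hypothetical combined masses for a 3-class problem
mc = {
    frozenset([0]): 0.5,
    frozenset([1]): 0.2,
    frozenset([0, 1, 2]): 0.3,
}
p = dsmp_singletons(mc, 3)
print(round(sum(p), 6), decide(mc, 3))  # 1.0 0
```

The mass of each compound element is shared among the classes it contains in proportion to their singleton masses plus $\varepsilon$, so a class with zero singleton mass (class 2 in the example) still receives a small non-zero probability.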

IV. Experimental Results 

A. Database Description and Performance Criteria 

Experiments are conducted on the well-known US Postal
Service (USPS) handwriting recognition task. This database
contains normalized grey-level handwritten digit images of
10 numeral classes, extracted from US postal envelopes. All
images are segmented and normalized to a size of 16x16
pixels. There are 7291 training samples and 2007 test
samples, some of which are corrupted and difficult to
classify correctly. To evaluate the performance of the
handwritten digit recognition system, a popular error
measure is considered: the Error Rate (ER) for each class
and the Mean Error Rate (MER) for all classes.
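
The text does not give the formulas of the two criteria. The sketch below assumes that ER is the percentage of misclassified test samples within each class and that MER is the percentage of misclassified samples over the whole test set; this interpretation, the function name and the toy labels are assumptions.

```python
def error_rates(y_true, y_pred, n_classes):
    """Return (per-class error rates in %, overall error rate in %)."""
    totals = [0] * n_classes
    errors = [0] * n_classes
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if p != t:
            errors[t] += 1
    # A class with no test sample gets an error rate of 0 by convention.
    er = [100.0 * e / n if n else 0.0 for e, n in zip(errors, totals)]
    mer = 100.0 * sum(errors) / len(y_true)
    return er, mer


# Hypothetical ground truth and predictions for a 2-class toy problem
er, mer = error_rates([0, 0, 1, 1, 1], [0, 1, 1, 1, 0], n_classes=2)
print([round(e, 2) for e in er], mer)  # [50.0, 33.33] 40.0
```

Under this assumption MER weights each class by its number of test samples, so it generally differs from the unweighted average of the per-class error rates when the classes are unbalanced.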

B. SVM Model Used for Validation 

An SVM model is produced for each class according to the
descriptor used. The training dataset is partitioned into
two equal subsets of samples: the first is the learning
subset, used to train each binary SVM classifier, and the
second is the validation subset. The validation phase thus
allows finding the optimal hyperparameters for the ten SVM
models. In our system, the RBF kernel is selected for the
experiments. The regularization and RBF kernel parameters
$(C, \sigma)$ of each SVM are tuned experimentally during
the learning phase, in such a way that the misclassification
error on the learning subset is zero and the error on the
validation subset is minimal for each SVM separating a
simple class from its complementary class.

C. Recognition Results and Discussion 

The test phase has been performed using all samples from
the test dataset. The performance of the handwritten digit
recognition system is first evaluated for an appropriate
choice of descriptors using the SVM classifiers, and we then
evaluate the combination of the SVM classifiers through the
DSmT framework.

1) Performance Evaluation of the Proposed Descriptors: In
these experiments, we compute the test error rate of the SVM
classifiers using Foreground Features (FF), Background
Features (BF), Geometric Features (GF), Uniform Grid
Features (UGF), and the descriptors which result from a
concatenation of at least two simple descriptors, such as
(BF,FF), (BF,FF,GF), and (UGF,BF,FF,GF). The experiments
show that an appropriate choice of descriptors and of their
concatenation to represent each digit class in the feature
generation step provides interesting recognition
performance. Table 1 reports the results obtained with SVM
classifiers for each descriptor. Concatenating background
and foreground features in the same vector gives a
significant improvement: the (BF,FF)-based descriptor
reduces the error rate by 6.71 percentage points relative to
FF alone (from 18.87% to 12.16%). A further reduction of 1.5
percentage points is obtained in experiment (c) with the
(BF,FF,GF)-based descriptor, which is constructed by
concatenating the (BF,FF)-based descriptor and the geometric
features in the same vector.

Furthermore, the UGF-based descriptor yields a recognition
error of 6.98%, which is 3.68 percentage points lower than
that of the (BF,FF,GF)-based descriptor. Finally, combining
the UGF and (BF,FF,GF)-based descriptors through
concatenation degrades the recognition performance: the
error rate increases by 2.73 percentage points relative to
UGF alone.

Table 1. Mean Error Rates of the SVM Classifiers Using
Different Methods of Feature Generation

Descriptor               MER (%)
(a) FF                   18.87
(b) (BF,FF)              12.16
(c) (BF,FF,GF)           10.66
(d) UGF                   6.98
(e) (UGF,BF,FF,GF)        9.71

As we can see, it is difficult to improve the recognition
performance by concatenating features, since most of the
time the combined descriptor does not take into account the
complementarity that can exist between the descriptors.

Hence, we propose a combination of SVM classifiers based on
DSmT for a better exploitation of the complementarity
between the descriptors. In this way, it is possible to
improve the recognition performance when the concatenation
of descriptors fails to provide the correct solution for
some specific handwritten digit recognition problems.

2) Performance Evaluation of the Proposed Combination
Framework: In these experiments, we evaluate a handwritten
digit recognition system based on a combination of SVM
classifiers through DSmT. The proposed combination framework
exploits the redundant and complementary nature of the
(BF,FF,GF) and UGF-based descriptors and manages the
conflict between the outputs of the SVM classifiers.

Decision making is done only on the simple classes
belonging to the frame of discernment. Hence, in both the
combination process and the calculation of the decision
measures, we consider the masses associated with all classes
representing the partial ignorance
$\bar{\theta}_i = \bigcup_{0 \le j \le n-1,\ j \ne i} \theta_j$
and $\bar{\theta}_i \cap \bar{\theta}_j$ such that
$i \ne j$.

For better comparison, Table 2 shows the recognition rate
computed separately on the test samples belonging to each
simple class using the SVM classifiers and the PCR6
combination rule. The corresponding mean error rates are
given in the last line of Table 2 for each algorithm.



As shown in Table 2, for the digits belonging to
$\theta_6$ the PCR6 algorithm yields a recognition rate of
96.47%, which is 1.76 percentage points higher than the
recognition accuracy of the SVM classifier trained with the
UGF-based descriptor but 0.59 percentage points lower than
that obtained when training the SVM classifier with the
(BF,FF,GF)-based descriptor. This is because some digits of
class $\theta_6$ are wrongly characterized by both the UGF
and (BF,FF,GF)-based descriptors. In other words, the PCR6
combination based algorithm is not reliable when the
complementary information provided by the two descriptors is
wrongly preserved.

Except for the samples belonging to class $\theta_6$, the
PCR6 combination based algorithm keeps the same recognition
performance as the best individual SVM classifier, trained
with the UGF-based descriptor, for the samples belonging to
$\theta_0$ and $\theta_3$, and it improves the recognition
accuracy for the samples belonging to classes $\theta_1$,
$\theta_2$, $\theta_4$, $\theta_5$, $\theta_7$, $\theta_8$,
and $\theta_9$.

Therefore, the proposed framework with the PCR6 combination
rule yields a recognition error of 5.43%, corresponding to a
decrease of 1.55 percentage points relative to the best
individual classifier. This is due to the efficient
redistribution of the partial conflicting mass only to the
elements involved in the partial conflict when using the
PCR6 combination rule.

V. Conclusion And Future Work 

We have proposed a new system which improves handwritten
digit recognition performance by combining the outputs of
two SVM classifiers. The proposed parallel combination is
performed in the DSmT framework using an estimation
technique based on a sigmoid transformation and a supervised
model, the PCR6 combination rule and a DSmP-based maximum
likelihood (ML) test. Experimental results show that the
proposed combination framework with the PCR6 rule yields the
best recognition accuracy even when the individual SVM
classifications provide conflicting outputs.

As a continuation of the present work, the next objective
is to incorporate two complementary descriptors into the
same handwritten digit recognition system in an attempt to
reduce the MER.
Table 2. Recognition rates (%) per class and mean error rates (MER) of the proposed framework with the PCR6 combination algorithm using the (BF,FF,GF) and UGF descriptors

Class      (BF,FF,GF)+SVMs   UGF+SVMs   PCR6 Combination Rule
0          93.31             98.05      98.05
1          95.45             96.21      96.97
2          87.37             91.92      93.94
3          82.53             89.16      89.16
4          80.00             88.50      91.00
5          83.13             90.00      92.50
6          97.06             94.71      96.47
7          91.16             91.84      95.24
8          87.95             89.16      93.37
9          89.27             93.79      94.35
MER (%)    10.66             6.98       5.43






